Abstract

Consider a single-stage problem in which we have a group of N agents who are attempting 
to minimize the expected cost of their joint actions, without the benefit of communication 
or a pre-established protocol but with complete knowledge of the expected cost of any joint 
set of actions for the group. We call this situation a static coordination problem. The 
central issue in defining an appropriate solution concept for static coordination problems 
is considering how to deal with the fact that if the agents are faced with a set of multiple 
(mixed) strategies that are equally attractive in terms of cost, a failure of coordination may 
lead to an expected cost value that is worse than that of any of the strategies in the set. 
In this proposal, we describe the notion of a general coordination problem, describe initial 
efforts at developing a solution concept for static coordination problems, and then outline 
a research agenda that centers on activities that will be the basis for obtaining a complete 
understanding of solutions to static coordination problems. 





1 Overview 

This document serves as the final report for the research conducted under grant NAG-1-03059. 
This research centers on the development of a solution concept for a problem in distributed 
decision making that may be posed as follows: There are M agents, each of which has an 
associated set of control actions that it may take. Each agent knows the actions available to 
every agent. All of the agents must simultaneously choose an action, and the M-tuple of chosen 
actions determines system performance in a way that is known precisely to each of the agents 
before the actions are selected. The agents may not communicate information about the actions 
that they will take, though it is known by all that each will seek to maximize system performance. 

There are three primary features that make this problem a special or limiting case of the 
general problem of distributed decision-making: 

1. The agents share a common objective of maximizing some measure of system performance. 

2. The agents each know how to predict system performance with certainty, given the actions 
selected by all the agents. Equivalently, we may say that the agents share a common model 
of system performance. 

3. No communication between agents to coordinate actions is allowed. 

This problem poses a significant, non-computational challenge only when there exists more than 
one set of control actions that optimizes system performance. 

In this report, we formulate the notion of a general coordination problem and then describe 
the solution concept for static coordination problems that constitutes a primary outcome of the 
research under grant NAG-1-03059. 

2 Problem Statement 

To motivate the notion of coordination processes (as fully developed in following sections), let 
us consider a simple example involving two people (Al and Betty) engaged in a telephone 
conversation. Suppose that in the middle of the conversation the line is cut and that both Al and 
Betty wish to resume their conversation as quickly as possible, but who should call whom? If 
both participants employ the same amount of time either picking up and dialing or waiting for 
the return call, then there might be a problem: 

1. If both Al and Betty redial, then both will receive busy signals and the line will not be 
reconnected. 

2. If both Al and Betty wait for the return call, then again the line will not be reconnected. 

3. Only if one party or the other places the call will the connection be made. 

Thus, unless Al and Betty established in advance a protocol for who must call back, they must 
play out a sequence of actions (dialing and/or waiting) for the connection to be remade. What 
complicates the problem is that Al and Betty do not have the opportunity to communicate with 
one another in order to establish the best way to proceed; the inability to communicate is the 
essence of the problem. We refer to the dynamic process of reestablishing their connection as a 
coordination process. 

Even though the Dial-Wait problem is a very simple example, it exposes a number of 
interesting issues. First, observe that if either player arbitrarily decides to implement a 
deterministic sequence of actions, say {W, D, D, W, D, W, ...}, where “W” corresponds to “wait” and 
“D” corresponds to “dial”, then the other player could have also selected the same sequence 
{W, D, D, W, D, W, ...}, which would result in the connection never being reestablished. While 
it seems unlikely that Al and Betty would make the same arbitrary choice, this is an example of 
the worst case outcome, and, without knowing the mechanism by which the other agent will 
select actions, it seems that we are stuck with this worst case as the only way to evaluate arbitrary 
decisions. 

In contrast, randomized strategies make a lot of sense. For example, if Al implements a 
strategy of dialing with probability $p \in [0, 1]$ at each dial-wait opportunity, independently of his 
or Betty’s actions at previous stages, and if Betty similarly dials with probability $q \in [0, 1]$ at 
each stage, then the coordination process reduces to a Markov chain, and the expected number 
of stages until the connection is reestablished works out to be $[p(1 - q) + (1 - p)q]^{-1}$, plotted in 
Figure 1. Note that if Al and Betty choose $p = 0$ and $q = 1$, respectively, then they reconnect 
in one stage, the best possible outcome. What complicates matters is that another solution 
achieves the same result, namely $p = 1$ and $q = 0$, and if Al and Betty can’t agree on which one 
of these two solutions to pick, then they wind up achieving the worst possible outcome, either 
$(p, q) = (0, 0)$ or $(p, q) = (1, 1)$, for which the connection is never reestablished. 
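
As a minimal sketch of the formula above, the expected number of stages can be evaluated directly. The function name `expected_stages` is introduced here only for illustration, and exact rational arithmetic is used so that the values are not blurred by rounding.

```python
from fractions import Fraction

def expected_stages(p, q):
    """Expected number of stages until reconnection: 1 / [p(1-q) + (1-p)q]."""
    success = p * (1 - q) + (1 - p) * q  # probability that exactly one party dials
    if success == 0:
        return float("inf")  # the connection is never reestablished
    return 1 / success

half = Fraction(1, 2)
print(expected_stages(0, 1))        # Al always waits, Betty always dials
print(expected_stages(half, half))  # both randomize evenly
print(expected_stages(1, 1))        # both always dial
```

The three calls correspond to a best-case profile, the evenly randomized profile, and a worst-case profile discussed in the text.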

Analyzing this example in game-theoretic terms, we observe that both of the best-case solutions 
$(p, q) = (1, 0)$ and $(p, q) = (0, 1)$ are Nash equilibria in the common interest game defined by the 
cost (disutility) function $f(p, q) = [p(1 - q) + (1 - p)q]^{-1}$. (Neither player can deviate unilaterally 
to achieve a lower value of cost.) There is a third Nash equilibrium solution where each player 
dials with probability one half so that (regardless of what action the other player chooses) the 
per-stage probability of reestablishing the connection is one-half, and the expected number of 
stages to terminate the process is two. Mathematically, $f(.5, q) = 2 = f(p, .5)$ for all $p, q \in [0, 1]$, 
which also shows that $(p, q) = (.5, .5)$ is a saddle-point (minimax) solution for the game. 
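
The equilibrium claims above can be spot-checked numerically. The sketch below is illustrative only: it tests unilateral deviations on a finite grid (an assumption made for simplicity, so it is a sanity check rather than a proof), and the names `cost`, `GRID`, and `is_equilibrium` are hypothetical.

```python
from fractions import Fraction

def cost(p, q):
    """Expected number of stages f(p, q); infinite if no stage can succeed."""
    success = p * (1 - q) + (1 - p) * q
    return float("inf") if success == 0 else 1 / success

GRID = [Fraction(k, 10) for k in range(11)]  # candidate unilateral deviations

def is_equilibrium(p, q):
    """True if neither player can lower the cost by deviating alone (on GRID)."""
    base = cost(p, q)
    return (all(cost(d, q) >= base for d in GRID)
            and all(cost(p, d) >= base for d in GRID))

half = Fraction(1, 2)
for profile in [(1, 0), (0, 1), (half, half), (half, 0)]:
    print(profile, is_equilibrium(*profile))
```

The first three profiles pass the check, while a profile such as $(p, q) = (.5, 0)$ fails because Al can lower the cost by dialing with certainty.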

Considering that there are multiple Nash equilibrium solutions, game theory doesn’t offer 
much insight. Given that Al and Betty will only play actions that are consistent with Nash 
equilibrium solutions, which equilibrium should they choose? Should they choose one of the 
Nash equilibria with lowest expected cost? If so, which one? (Again, if Al and Betty disagree 
on this point then they experience the worst case outcome.) Should Al and Betty settle on the 
mixed strategy equilibrium $(p, q) = (.5, .5)$? In what sense would this be rational? Is there 
something more to the solution $(p, q) = (.5, .5)$ than the fact that it is a Nash equilibrium in 
the common interest game defined by $f(p, q)$? (If so, then the problem goes beyond one of 
“equilibrium selection.”) 

Figure 1: Expected number of stages to reestablish connection (note log-scale on the z-axis), 
where p is the per-stage probability that Al dials and q is the per-stage probability that Betty 
dials. 

Concepts from game theory are generally of limited use for distributed decision-making 
without coordination. Notions of equilibria in games are founded on the idea that the game will 
be played many times, with the players gaining some information each time. Under such an 
assumption, it is not critical to define a computational procedure for finding equilibrium policies. 
Indeed, much of the literature in game theory focuses on defining concepts of equilibria and 
establishing the conditions under which they exist. In our problem, however, we must adopt an 
algorithmic approach, defining a reasonable concept of a solution to the problem and providing 
an algorithm by which this solution can be found. In this sense, we propose to treat 
distributed decision-making without coordination much more like a decision theoretic or optimization 
problem than a formal game. 


3 General Coordination Processes 

We generalize the dial-wait example by defining the notion of coordination processes as follows. A 
coordination process is a discrete-stage dynamic process, similar to a Markov control process [8] 
and the dynamic team model of [10], whose state is denoted by $x \in X$, where $X$ is a Borel 
space representing the set of all possible operating conditions (the state space) associated with 
the coordination process. For each state $x \in X$ there is an associated set $U(x) = \{1, 2, \ldots, N_x\}$ 
of actors whose actions jointly determine the outcome of the process per stage, where $N_x$ is 
the (possibly infinite) number of actors associated with state $x \in X$. Each actor $i \in U(x)$ 
has an associated set of actions (action set) $A_i(x)$, where $A_i(x)$ is a Borel space. A feasible 
action profile $a = (a_1, \ldots, a_{N_x}) \in \prod_{i \in U(x)} A_i(x) = A(x)$ represents a collection of actions selected 
independently by the actors $U(x)$ when the coordination process is in state $x \in X$. We assume 
that the set $K = \{(x, a) \mid x \in X,\ a \in A(x)\}$ is a Borel measurable subset of $X \times A$. In jointly 
(and independently) selecting a feasible action profile $a \in A(x)$, the actors collectively determine 
the probability distribution for the next state of the coordination process. Let $Q$ denote the stochastic 
kernel on $X$ given $K$ that serves as the state transition law of the coordination process. The actors’ 
joint selection of actions in a feasible action profile also determines the probability distribution 
$F^{(x,a)}$ for the random system-level cost associated with the state transition, where $(x, a) \in K$. 
For convenience, let $c_x(a)$ denote the expected cost associated with state transitions from $x \in X$ 
under the feasible action profile $a \in A(x)$. We assume that all actors have perfect knowledge of 
the state transition law $Q$, whereas, depending upon the application, we may or may not assume 
that the transition cost distribution function is known. 

A coordination process evolves in discrete time, starting from an initial state $x^0 \in X$. Let 
$H^0 = \{x^0\}$ denote the initial history of the process. Each actor $i \in U(x^0)$ observes a (possibly 
set-valued) function $\zeta_i^0(H^0)$ of the initial history and, upon making this observation, independently 
chooses an action $a_i^0 \in A_i(x^0)$. Let $a^0 = (a_1^0, \ldots, a_{N_{x^0}}^0) \in A(x^0)$ represent the aggregation of all 
actions selected by the actors in $U(x^0)$. Then, according to the state transition law $Q$ and the 
transition-cost distribution function $F^{(x^0, a^0)}$, the coordination process transitions to a new state 
$x^1 \in X$ and experiences a transition cost $c^0$. Let $H^1 = \{x^0, a^0, c^0, x^1\}$ denote the history of the 
process at stage 1. The process continues similarly over some (possibly random and/or infinite) 
time horizon $T$. That is, at stage $t = 1, \ldots, T - 1$: 

1. Given the history $H^t$, each actor $i \in U(x^t)$ observes a (possibly set-valued) function 
$\zeta_i^t(H^t)$ of the history, and independently chooses an action $a_i^t \in A_i(x^t)$. 

2. Under the feasible profile $a^t = (a_1^t, \ldots, a_{N_{x^t}}^t) \in A(x^t)$, the system transitions to the next 
state $x^{t+1} \in X$ and experiences a transition cost $c^t$ according to the state transition law $Q$ 
and the transition-cost distribution function $F^{(x^t, a^t)}$, with $H^{t+1} = H^t \cup \{a^t, c^t, x^{t+1}\}$. 

Example 1 (Dial-Wait Revisited) The Dial-Wait example can be seen to be a coordination 
process in which there are two states $X = \{1, \Omega\}$, with two actors per state $U(1) = U(\Omega) = 
\{1, 2\}$, each actor being one of the two parties in the call. State $x = 1$ corresponds to the 
“disconnected” state, where the two callers are still trying to reestablish their connection, and state 
$x = \Omega$ corresponds to the “connected” state, where the two callers have finally managed to establish 
their connection. While in state $x = 1$ both actors $i = 1, 2$ have two pure actions available, 
$\bar A_i(1) = \{1, 2\}$, where $\bar a_i = 1$ corresponds to the decision to dial and $\bar a_i = 2$ corresponds to the 
decision to wait. Allowing both actors to randomize their decisions we have $A_i(1) = [0, 1]$, where 
$a_i$ corresponds to the probability that $\bar a_i = 1$, $i = 1, 2$, and the probability of transitioning from 
$x = 1$ to $x = \Omega$ is then $a_1(1 - a_2) + (1 - a_1)a_2$. Since both actors seek to be reconnected as quickly 
as possible, they perceive (deterministically) a system-level transition cost $c(1) = 1$ for all state 
transitions from $x = 1$, including self-transitions. The coordination process terminates as soon 
as it transitions into $x = \Omega$, so that the total cost associated with reconnecting is $\sum_{t=0}^{T-1} c(1) = T$, 
where $T$ is the random number of stages of dialing/waiting until reconnecting. 
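
The process described in Example 1 is easy to simulate. The following is a rough sketch under stated assumptions: the function name `simulate_dial_wait`, the stage cap `max_stages`, and the fixed random seed are illustrative choices that do not come from the report.

```python
import random

def simulate_dial_wait(a1, a2, rng, max_stages=10_000):
    """Run one Dial-Wait coordination process and return the number of stages T.

    a1 and a2 are the probabilities that actors 1 and 2 dial at each stage.
    Every stage costs 1, so the total cost of the run equals T.
    """
    for stage in range(1, max_stages + 1):
        dial1 = rng.random() < a1
        dial2 = rng.random() < a2
        if dial1 != dial2:  # exactly one actor dials: transition to Omega
            return stage
    return max_stages  # truncated run (the cap is an assumption, not part of the model)

rng = random.Random(0)
runs = [simulate_dial_wait(0.5, 0.5, rng) for _ in range(20_000)]
print(sum(runs) / len(runs))  # sample mean of T, to compare with the predicted value 2
```

With $a_1 = a_2 = 0.5$ the per-stage transition probability is one half, so the sample mean should be close to two stages.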
While the Dial-Wait example is very simple, the mathematical framework for coordination 
processes above is quite general, having much of the same structure as that for Markovian control 
processes [6, 4, 2, 3, 8, 9]. Of course, coordination processes are more general since they allow for 
multiple decision makers (i.e. actors) who interact through their actions as time evolves. What 
distinguishes our framework from earlier work on dynamic noncooperative games [1] is that we 
make the extreme simplifying assumption that all players have the same objective: to minimize 
a single system-level notion of cost. It is this special structure that provides the opportunity for 
advances above and beyond the confines of game theory as known today, and because of this we 
refer to the decision makers in our model as “actors” rather than as players. 

Despite assuming that all actors have the same preference structure, the general class of 
coordination processes above is too unstructured to allow for much progress (either analytically 
or computationally). Fortunately, at least for the applications that we have in mind, it is possible 
to identify a simpler model whose structure can be exploited in defining new solution concepts 
and in characterizing and computing optimal coordination strategies. In robotic applications, for 
example, it makes sense to assume that the set of actors $U(x)$ is fixed, i.e. not dependent on the 
state of the process. Also, for the planning-type applications, it makes sense to assume that the 
state space X and the sets of pure actions available to each actor are all finite. Thus, we propose 
to focus attention on coordination processes that satisfy the following structural assumption: 


Assumption 1 The following are true. 

1. The state space $X$ is finite. 

2. The same set of actors applies at all states, and this set is finite. That is, $U(x) = U = 
\{1, 2, \ldots, N\}$ for all $x \in X$, where $N$ denotes the (constant) number of actors. 

3. Each actor $i \in U$ has a finite set of pure actions $\bar A_i(x)$ and chooses mixed actions over 
$\bar A_i(x)$ as the coordination process evolves, i.e. 

$$A_i(x) = \Big\{ a_i \;\Big|\; a_{i,j} \ge 0,\ j \in \bar A_i(x),\ \text{and} \sum_{j \in \bar A_i(x)} a_{i,j} = 1 \Big\}.$$

The coordination process evolves as a sequence of each actor $i$ choosing “mixed actions” 
$a_i \in A_i(x)$, where the decision at each stage is what probability distribution over $\bar A_i(x)$ should 
be played. 

4. For each pair of states $x$ and $x'$ in $X$ and for each profile of pure actions $\bar a = (\bar a_1, \ldots, \bar a_N) \in 
\prod_{i=1}^{N} \bar A_i(x)$, there is a corresponding state transition probability $p_{xx'}(\bar a)$, representing the 
probability of transitioning from $x$ to $x'$ under the pure actions $\bar a$. Let $p_{xx'}(a)$ denote the 
resulting probability of transitioning from $x$ to $x'$ under the profile of (mixed) actions $a \in A(x)$. 
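
To make item 4 concrete, the transition probability under mixed actions can be obtained by averaging the pure-action transition probabilities over the independent randomizations of the actors. The sketch below assumes this product-form averaging (which is how independent mixing is usually interpreted); the names `mixed_transition_prob` and `reconnect` are hypothetical.

```python
from itertools import product

def mixed_transition_prob(pure_prob, mixed_profile):
    """Transition probability under a profile of mixed actions.

    pure_prob(pure_profile) returns the transition probability for a tuple of
    pure-action indices; mixed_profile[i][j] is the probability that actor i
    plays pure action j. Actors are assumed to randomize independently.
    """
    total = 0.0
    index_sets = [range(len(dist)) for dist in mixed_profile]
    for pure_profile in product(*index_sets):
        weight = 1.0
        for actor, action in enumerate(pure_profile):
            weight *= mixed_profile[actor][action]
        total += weight * pure_prob(pure_profile)
    return total

# Dial-Wait: action 0 = dial, action 1 = wait; reconnect iff exactly one dials.
reconnect = lambda prof: 1.0 if prof[0] != prof[1] else 0.0
a1, a2 = 0.3, 0.8
result = mixed_transition_prob(reconnect, [(a1, 1 - a1), (a2, 1 - a2)])
print(result, a1 * (1 - a2) + (1 - a1) * a2)  # the two values should agree
```

For the Dial-Wait process this reproduces the expression $a_1(1 - a_2) + (1 - a_1)a_2$ from Example 1.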


Note that Assumption 1 essentially restricts attention to coordination processes that have the same 
basic structure as Markov decision processes [11] and (zero-sum) competitive Markovian decision 
processes [7]. We plan to focus attention on a class that we term transient coordination processes. 
These are coordination processes in which each actor must identify an appropriate (mixed) action 
for each state of the process so as to reach an absorbing and cost-free state along a minimum 
cost path, as in the Dial-Wait example. Problems of this type have very much the character 
of so-called transient competitive Markovian decision processes [7], although, because of our 
assumption about a common objective function, we find it convenient to introduce a new solution 
concept, namely minimum unambiguous value (MUV) solutions, to address the inadequacy of 
Nash equilibria as observed in the Dial-Wait example. 


4 Transient Coordination Processes 


In this section, we focus on coordination processes with the property that all actors seek to 
drive an underlying system to an absorbing, zero-cost state $\Omega$ along a minimum cost trajectory 
through the state space. 


Definition 1 A transient coordination process is a coordination process in which all actors seek 
to minimize the expected discounted cost associated with the evolution of the system, 

$$E\left\{ \limsup_{T \to \infty} \sum_{t=0}^{T} \gamma^t c^t \;\middle|\; \text{actor decisions} \right\}, \tag{1}$$

where $\gamma \in [0, 1]$ is a discount factor, subject to Assumption 1 and the following additional 
assumptions. 

1. The coordination process has an absorbing, zero-cost state $\Omega \in X$. That is, there exists a 
state $\Omega$ such that $p_{\Omega\Omega}(a) = 1$ and $c_\Omega(a) = 0$ for all $a \in A(\Omega)$. 

2. Each actor has perfect knowledge of the distribution function $F^{(x,a)}$. 

3. In selecting an action at each stage, each actor has knowledge only of (or restricts attention 
to) the current state of the system $x^t$ at each stage of the process, i.e. $\zeta_i^t(H^t) = x^t$ for all 
$t = 0, 1, \ldots$. 


Some examples of transient coordination processes follow. 


Static Coordination Problems Consider the situation where a given set of actors $U$ are 
engaged in a single (aggregate) decision over a set $A$ of feasible action profiles, where 
the outcome of the process is a random cost $C$ whose distribution function $F_a$ is determined by 
the feasible action profile $a \in A$ selected by the actors. Static coordination problems of this type 
can be interpreted as transient coordination processes involving two states $X = \{1, \Omega\}$, where 
(i) the system starts out in state $x = 1$ (with $A(1) = A$ and $F^{(1,a)} = F_a$) and (ii) the system 
transitions immediately to the terminal state $\Omega$, which is absorbing and cost free. 

Finite-Horizon Coordination Processes Finite-state, finite-horizon processes can similarly 
be expressed as transient coordination processes. Suppose a given set of actors $U$ make decisions 
over a predetermined finite time horizon $T$, where $X_n$ denotes the set of all possible states of the 
coordination process at stage $n = 0, 1, \ldots, T - 1$. In this case, the set $X = X_0 \cup X_1 \cup \cdots \cup X_{T-1} \cup \{\Omega\}$ 
can be interpreted as the state space of an equivalent transient coordination process, where the 
transition probability matrix is such that only transitions from $X_n$ to $X_{n+1}$ for $n = 0, 1, \ldots, T - 2$ 
and from $X_{T-1}$ to $\{\Omega\}$ have nonzero probability. 

Discounted Cost Coordination Processes Infinite-horizon, finite-state processes with a 
discounted cost criterion can also be expressed as transient coordination processes. Suppose a 
given set of actors $U$ make decisions on an infinite time horizon, where 

1. the finite set $X$ denotes the state space associated with each stage of the process, 

2. $p_{xx'}(a)$ is the probability of transitioning from $x \in X$ to $x' \in X$ under the feasible action 
profile $a$, and 

3. all actors seek to minimize the expected discounted cost objective of Equation (1) with 
$\gamma < 1$. 

Using a well-known trick from the theory of Markov decision processes, we can define an 
equivalent transient coordination process that evolves over the state space $\tilde X = X \cup \{\Omega\}$, where $\Omega$ is 
a cost-free absorbing state, by adjusting the transition probability matrix so that $1 - \gamma$ is the 
probability of transitioning to $\Omega$ from any state $x \in X$ and transitions from $x \in X$ to $x' \in X$ 
occur with probability $\gamma \cdot p_{xx'}(a)$. In this way, the effect of the discount factor shows up as the 
per-stage probability of not terminating in the equivalent transient model. 
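
The construction above can be illustrated for a fixed action profile, in which case the process is an ordinary Markov chain. The sketch below is only a numerical illustration: the two-state chain, its costs, and the function names `to_transient`, `total_cost`, and `discounted_cost` are all hypothetical, and both sums are truncated after a fixed number of terms.

```python
def to_transient(P, gamma):
    """Augment transition matrix P (list of rows) with an absorbing state Omega.

    Each original transition is scaled by gamma, and every original state moves
    to Omega (the last index) with probability 1 - gamma.
    """
    n = len(P)
    augmented = [[gamma * P[x][y] for y in range(n)] + [1 - gamma] for x in range(n)]
    augmented.append([0] * n + [1])  # Omega is absorbing
    return augmented

def total_cost(P_aug, costs, sweeps=500):
    """Expected undiscounted total cost until absorption, by repeated backups."""
    n = len(costs)
    v = [0.0] * (n + 1)  # the value of Omega stays 0 because it is cost free
    for _ in range(sweeps):
        v = [costs[x] + sum(P_aug[x][y] * v[y] for y in range(n + 1))
             for x in range(n)] + [0.0]
    return v[:n]

def discounted_cost(P, costs, gamma, horizon=500):
    """Truncated discounted cost sum_t gamma^t E[c_t] for each start state."""
    n = len(costs)
    expected = list(costs)  # E[c_t | x_0 = x] at t = 0
    total = [0.0] * n
    for t in range(horizon):
        total = [total[x] + gamma ** t * expected[x] for x in range(n)]
        expected = [sum(P[x][y] * expected[y] for y in range(n)) for x in range(n)]
    return total

P = [[0.5, 0.5], [0.2, 0.8]]  # hypothetical chain under a fixed action profile
costs = [1.0, 2.0]            # hypothetical expected per-stage costs
gamma = 0.9
print(total_cost(to_transient(P, gamma), costs))
print(discounted_cost(P, costs, gamma))
```

The two printed vectors should agree up to rounding, which reflects the fact that discounting by $\gamma$ and terminating with probability $1 - \gamma$ per stage lead to the same expected cost.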

Stochastic Shortest Path Coordination Processes A quite general class of transient 
coordination processes is defined by the undiscounted stochastic shortest path assumptions of [5]. 
In this case, one can refine Definition 1 by assuming (additionally) that (i) at least one actor 
$u \in U$ has the ability to guarantee termination of the process (i.e. that the system will eventually 
transition to $\Omega$) with probability one and (ii) whenever the actors behave in such a way that 
there is a chance that the system never terminates, then the expected total cost of the process 
is infinite. 

In the research under this grant, we have restricted attention to static coordination problems. 
A proposed solution concept for this class of problems is described below. 

5 Minimum Unambiguous Value 

The central issue in defining an appropriate solution concept is considering how to deal with the 
fact that if the agents are faced with a set of multiple (mixed) action profiles that are equally 
attractive in terms of cost, a failure of coordination may lead to an expected cost value that is 
worse than that of the profiles in the set. Below we propose a solution concept that is grounded 
in the idea that any optimal action profile for a coordination problem should be unambiguous 
in the sense of not being subject to degradation if any of the agents elect to take an alternative 
action that is equally attractive in terms of cost. 

To formalize this notion, we first define some notation. Given two mixed action profiles, 
$\xi^0 = (x_1^0, x_2^0, \ldots, x_N^0)$ and $\xi^1 = (x_1^1, x_2^1, \ldots, x_N^1)$, the worst case confusion that can arise between 
$\xi^0$ and $\xi^1$ is 

$$\phi(\xi^0, \xi^1) = \max_{(k_1, \ldots, k_N) \in \{0,1\}^N} f(x_1^{k_1}, x_2^{k_2}, \ldots, x_N^{k_N}). \tag{2}$$

Given a mixed action profile $\xi = (x_1, x_2, \ldots, x_N)$, define$^1$ 

$$\Phi(\xi) = \sup_{\xi^0 \,:\, f(\xi^0) = f(\xi)} \phi(\xi^0, \xi). \tag{3}$$

Note that $\Phi(\xi) \ge f(\xi)$ for all $\xi$ since $\phi(\xi, \xi) = f(\xi)$. We say that a mixed action profile $\xi$ has an 
ambiguous value ceiling if $\Phi(\xi) = f(\xi)$. In other words, if $\xi$ has an ambiguous value ceiling, then 
there is no risk that a failure to coordinate in choice of equally attractive profiles will lead to 
an objective value worse than that of $\xi$. If $\xi$ does not have an ambiguous value ceiling, i.e. if 
$\Phi(\xi) > f(\xi)$, then we say that $\xi$ is confusable. 
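
The worst case confusion $\phi$ can be sketched in a few lines by enumerating every way the actors might independently pick their component from one profile or the other. This is a minimal illustration, not an implementation taken from the report: the name `worst_case_confusion` and the toy cost `mismatch_cost` are hypothetical, and pure actions are used in the demonstration only to keep it short.

```python
from itertools import product

def worst_case_confusion(f, xi0, xi1):
    """phi(xi0, xi1): the maximum of f over every way the actors can mix components.

    xi0 and xi1 are tuples with one entry per actor; actor i independently
    plays either xi0[i] or xi1[i].
    """
    candidates = (xi0, xi1)
    n = len(xi0)
    return max(
        f(tuple(candidates[k[i]][i] for i in range(n)))
        for k in product((0, 1), repeat=n)
    )

# Hypothetical two-actor cost: 0 when the actions differ, 1 when they match.
mismatch_cost = lambda profile: 0 if profile[0] != profile[1] else 1
print(worst_case_confusion(mismatch_cost, ("dial", "wait"), ("wait", "dial")))
```

In this toy case both profiles have cost zero on their own, yet confusing them can produce a profile of cost one, so each of them is confusable in the sense defined above.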

In terms of the above definition, we assert that one can cast the problem faced by the agents 
as being one of finding a solution that has smallest expected cost among all solutions that have 
an ambiguous value ceiling, a point which we call a solution of minimum unambiguous value , or 
MUV for short. 


Definition 2 (MUV) We say that an action profile $\xi^1$ is MUV if (i) $\xi^1$ has an ambiguous value 
ceiling and (ii) all $\xi$ such that $f(\xi) < f(\xi^1)$ are confusable. 

Example 2 (Dial-Wait Revisited) Referring back to the Dial-Wait example of Section 2, we 
observe that three mixed strategy profiles have an ambiguous value ceiling: 

1. The solution where both Al and Betty dial with probability one, i.e. $x_{1,\mathrm{dial}} = p_1 = 1$ and 
$x_{2,\mathrm{dial}} = p_2 = 1$, has an ambiguous value ceiling of one. 

2. The solution where both Al and Betty dial with probability zero (that is, both always wait), i.e. 
$x_{1,\mathrm{dial}} = p_1 = 0$ and $x_{2,\mathrm{dial}} = p_2 = 0$, also has an ambiguous value ceiling of one. 

3. The solution where both Al and Betty dial independently with probability one half, i.e. 
$x_{1,\mathrm{dial}} = p_1 = .5$ and $x_{2,\mathrm{dial}} = p_2 = .5$, has an ambiguous value ceiling of one half. 

The last solution (i.e. $x_{1,\mathrm{dial}} = p_1 = .5$ and $x_{2,\mathrm{dial}} = p_2 = .5$) is such that any attempt to achieve a 
lower expected value, namely lower than one half, can be confused with another solution achieving 
the same value, resulting in greater expected cost. Thus, $(p_1, p_2) = (.5, .5)$ is the MUV solution to 
the Dial-Wait problem. 
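
The ceiling values quoted in Example 2 can be reproduced under one explicit assumption: that the single-stage cost is the probability that the stage fails to reconnect, $1 - [p_1(1 - p_2) + (1 - p_1)p_2]$. This cost is not stated in the example itself; it is chosen here because it yields the values one and one half given above. The names `static_cost` and `phi` are illustrative.

```python
from fractions import Fraction
from itertools import product

def static_cost(p1, p2):
    """Assumed single-stage cost: probability that the stage fails to reconnect."""
    return 1 - (p1 * (1 - p2) + (1 - p1) * p2)

def phi(xi0, xi1):
    """Worst case confusion between two (p1, p2) profiles."""
    pair = (xi0, xi1)
    return max(static_cost(pair[k1][0], pair[k2][1])
               for k1, k2 in product((0, 1), repeat=2))

half = Fraction(1, 2)
grid = [Fraction(k, 20) for k in range(21)]

# Under the assumed cost, the profiles with cost 1/2 are those with p1 = 1/2 or p2 = 1/2.
same_value = [(half, q) for q in grid] + [(p, half) for p in grid]
print(max(phi(xi0, (half, half)) for xi0 in same_value))  # never exceeds 1/2

# The two zero-cost profiles, by contrast, can be confused with each other.
print(phi((1, 0), (0, 1)))  # both dial or both wait, so the cost rises to 1
```

Under this assumed cost, the profile $(.5, .5)$ keeps its value of one half no matter which equally valued profile it is confused with, whereas the two zero-cost profiles $(1, 0)$ and $(0, 1)$ degrade to cost one when confused.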


$^1$ In Section 7.3 we propose an alternative definition in which the supremum is over all $\xi^0$ such that $f(\xi^0) \le f(\xi)$. 

In general, if a unique MUV solution $\xi$ exists, then we may take $\xi$ as the solution to the 
identical interest game defined by $f$. Indeed, the MUV solution concept seems to adequately 
capture the essence of rational decision making under the axiom of no arbitrary action. However, 
there are still many aspects of the concept that need to be addressed. 

Open Problem 1 If $\xi^1$ is MUV, and if $\xi$ is such that $f(\xi) < f(\xi^1)$, is it true that $\Phi(\xi) > f(\xi^1)$? 

In addition, we need to understand the conditions under which MUV solutions exist. If 
existence is not generally guaranteed (as it may not be, since the sets of solutions that have 
ambiguous value ceilings may not have nice topological properties), then it may be useful to 
consider notions of approximate (or ε-) MUV. 

Open Problem 2 Under what conditions will a MUV or ε-MUV solution exist? 

We also need to understand the relationships, if any, between MUV solutions and the various 
notions of equilibria that have been defined for games and other related problems. Are the 
agents facing a problem of equilibrium selection, or something quite different that will require 
the development of a new algorithm? The latter case is more likely and it exposes the need to 
understand the computational complexity of the problem. This is a non-trivial task, since it is 
not apparent how to formulate the task of finding a MUV point as a standard computational or 
optimization problem. It is likely that the problem is NP-hard and we will need to turn to the 
development of reasonable heuristics or approximation algorithms. This would be an acceptable 
outcome; indeed our primary purpose in studying the theory of transient coordination processes 
is to develop insight that can be the basis for the design of effective heuristics for real-world 
applications. 


5.1 Alternative Characterizations of MUV 

In this section we consider some alternative characterizations of MUV that may prove useful in 
defining a computational approach to solving static coordination processes. 


5.1.1 Alternative 1 

For an alternative characterization of MUV, it is convenient to define 

$$v^* = \inf V, \qquad V = \left\{ v \;\middle|\; \text{there exists } \xi^1 \text{ such that (1) } f(\xi^1) = v \text{ and (2) } \Phi(\xi^1) = f(\xi^1) \right\}. \tag{4}$$

Note that the set $V \subset \mathbb{R}$ is non-empty. For example, it contains $v_{\max} = \max_{a_1, a_2, \ldots, a_N} f(a_1, a_2, \ldots, a_N)$. 

Open Problem 3 How do we characterize $V$? Is it a finite set? Does it have a minimum 
value? 

To establish a connection with Definition 2, we observe the following. 

Lemma 1 A mixed strategy profile $\xi$ is MUV if and only if 

$$f(\xi) = v^* \quad \text{and} \quad \Phi(\xi) = v^*,$$

with $v^*$ as defined in Equation (4). 

Note that if $\xi$ is not the unique MUV solution, then we have the following. 

Lemma 2 If $\xi^1$ and $\xi^2$ are both MUV, then $\phi(\xi^1, \xi^2) \le v^*$. 

In other words, if $\xi^1$ and $\xi^2$ are both MUV (achieving expected cost $v^*$), then any profile $\xi^3$ 
constructed as a permutation of $\xi^1$ and $\xi^2$ cannot have cost greater than $v^*$. 

Open Problem 4 Can the inequality in Lemma 2 be strict? 

Open Problem 5 Given two MUV solutions $\xi^1 = (x_1^1, x_2^1, \ldots, x_N^1)$ and $\xi^2 = (x_1^2, x_2^2, \ldots, x_N^2)$, 
and defining $\xi^3 = (x_1^3, x_2^3, \ldots, x_N^3) = (x_1^{k_1}, x_2^{k_2}, \ldots, x_N^{k_N})$ with arbitrary permutation $(k_1, k_2, \ldots, k_N) \in 
\{1, 2\}^N$, is it true in general that $\phi(\xi^0, \xi^3) \le v^*$ for all $\xi^0$ such that $f(\xi^0) = v^*$? 

If the answer to the open problem above is “yes” for all $N$, then $\xi^3$ is such that confusing it 
with any other profile $\xi^0$ such that $f(\xi^0) = v^*$ results in cost less than or equal to $v^*$. 

Open Problem 6 Suppose $\xi^1$ is such that $\Phi(\xi^1) = f(\xi^1)$ and suppose $\xi^2$ is such that $\Phi(\xi^2) = 
f(\xi^2) < f(\xi^1)$. Is it true that $\phi(\xi^1, \xi^2) \le f(\xi^1)$? 

5.1.2 Alternative 2 

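Given $v \in \mathbb{R}$, define 

$$\Psi(v) = \inf_{\xi \,:\, f(\xi) = v} \Phi(\xi). \tag{5}$$

Now define 

$$\tilde v^* = \inf \tilde V, \qquad \tilde V = \{ v \mid v = \Psi(v) \}. \tag{6}$$

Conjecture 1 It is true that $v^* = \tilde v^*$, in which case MUV corresponds to a minimal fixed point 
of $\Psi$. 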

6 Evaluating MUV Solutions 

The concepts that we have considered up to this point do not by themselves suggest a method 
for actually computing a MUV solution. In general this seems to be a very hard problem, and 
well outside the scope of this initial research effort. Some special cases, however, are tractable 
as discussed below. 

6.1 Two-Player, Two-Action Games 

Consider the static coordination problem defined by the matrix 

$$F = \begin{bmatrix} f(a_{1,1}, a_{2,1}) & f(a_{1,1}, a_{2,2}) \\ f(a_{1,2}, a_{2,1}) & f(a_{1,2}, a_{2,2}) \end{bmatrix},$$

where the set of pure actions available to Player 1 is $A_1 = \{a_{1,1}, a_{1,2}\}$ and the set of pure actions 
available to Player 2 is $A_2 = \{a_{2,1}, a_{2,2}\}$. To simplify the notation a bit, we will use $f_{ij}$ to refer 
to $f(a_{1,i}, a_{2,j})$, the cost of the action profile $(a_{1,i}, a_{2,j})$, for $i, j \in \{1, 2\}$. We parameterize Player 1’s 
mixed strategies as $x_1 = (p_1, 1 - p_1)$ and Player 2’s strategies as $x_2 = (p_2, 1 - p_2)$, where $p_1$ is 
the probability that Player 1 chooses $a_{1,1} \in A_1$ and $p_2$ is the probability that Player 2 chooses 
$a_{2,1} \in A_2$. With this parameterization in mind, we can express $f(x_1, x_2)$ as 

$$V(p_1, p_2) = p_1 p_2 f_{11} + (1 - p_1) p_2 f_{21} + p_1 (1 - p_2) f_{12} + (1 - p_1)(1 - p_2) f_{22}. \tag{7}$$

To make this problem interesting, let us assume that $f_{11} = k = f_{22}$ and $f_{12}, f_{21} \ge k$. If the 
inequalities are strict, then there are two conflicting optimal solutions to the problem, 
and it becomes interesting to consider the MUV solution to the problem. 



Lemma 3 Assuming that $f_{11} = k = f_{22}$ and $f_{12}, f_{21} \ge k$, with $f_{12} + f_{21} > 2k$, the mixed action 
pair parameterized by 

$$p_1^* = \frac{f_{21} - k}{f_{12} + f_{21} - 2k}, \qquad p_2^* = \frac{f_{12} - k}{f_{12} + f_{21} - 2k} \tag{8}$$

is a MUV solution to the static coordination game defined by $F$, and 

$$V(p_1^*, p_2^*) = k + \frac{(f_{12} - k)(f_{21} - k)}{f_{12} + f_{21} - 2k}. \tag{9}$$

Proof: By hypothesis, we have that 

$$V(p_1, p_2) = k + (f_{12} - k) p_1 - (f_{12} + f_{21} - 2k) p_1 p_2 + (f_{21} - k) p_2.$$

Plugging in $p_2^*$ we have 

$$\begin{aligned} V(p_1, p_2^*) &= k + \left[ (f_{12} - k) - (f_{12} + f_{21} - 2k) p_2^* \right] p_1 + (f_{21} - k) p_2^* \\ &= k + (f_{21} - k) p_2^* \\ &= k + \frac{(f_{12} - k)(f_{21} - k)}{f_{12} + f_{21} - 2k} \end{aligned}$$

for all $p_1 \in [0, 1]$. Similarly, given that Player 1 adopts the mixed strategy $p_1^*$, we have 

$$V(p_1^*, p_2) = k + \frac{(f_{12} - k)(f_{21} - k)}{f_{12} + f_{21} - 2k}$$

for all $p_2 \in [0, 1]$. Thus, the mixed action pair parameterized by $p_1^*$ and $p_2^*$ has an ambiguous value 
ceiling. Moreover, it can be verified that any $p_1, p_2$ such that $V(p_1, p_2) < V(p_1^*, p_2^*)$ is confusable. 
Q.E.D. 
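
The formulas of Lemma 3 can be checked numerically. The sketch below is illustrative: the cost values $f_{12} = 3$, $f_{21} = 2$, $k = 1$ are hypothetical, and the function names `lemma3_solution` and `V` are introduced only for this example. Exact fractions are used so that the equality tests are not affected by rounding.

```python
from fractions import Fraction

def lemma3_solution(f12, f21, k):
    """Return (p1*, p2*, V*) for a 2x2 game with f11 = f22 = k (integer inputs)."""
    denom = f12 + f21 - 2 * k
    if denom <= 0:
        raise ValueError("requires f12 + f21 > 2k")
    p1 = Fraction(f21 - k, denom)
    p2 = Fraction(f12 - k, denom)
    value = k + Fraction((f12 - k) * (f21 - k), denom)
    return p1, p2, value

def V(p1, p2, f12, f21, k):
    """Expected cost of Equation (7) with f11 = f22 = k."""
    return (p1 * p2 * k + (1 - p1) * p2 * f21
            + p1 * (1 - p2) * f12 + (1 - p1) * (1 - p2) * k)

f12, f21, k = 3, 2, 1  # hypothetical cost values
p1s, p2s, vs = lemma3_solution(f12, f21, k)
grid = [Fraction(j, 10) for j in range(11)]
print(p1s, p2s, vs)
print(all(V(p, p2s, f12, f21, k) == vs for p in grid))  # constant in p1
print(all(V(p1s, q, f12, f21, k) == vs for q in grid))  # constant in p2
```

Both checks confirm the key step of the proof on the sampled grid: once one player adopts the starred strategy, the expected cost no longer depends on what the other player does.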

6.2 Symmetric Problems 

For static coordination problems that are symmetric in the sense that 

$$f(x_1, x_2, \ldots, x_N) = f(x_{p(1)}, x_{p(2)}, \ldots, x_{p(N)}) \tag{10}$$

for any permutation $\{p(1), p(2), \ldots, p(N)\}$ of $\{1, 2, \ldots, N\}$, it is possible to infer a property of 
the MUV solution. 

The MUV solution to the symmetric problem is $(x^*, x^*, \ldots, x^*)$, where 

$$x^* = \arg\min_x f(x, x, \ldots, x).$$
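
For a small symmetric problem, the candidate $x^*$ can be located by a direct search along the diagonal. The sketch below assumes the single-stage Dial-Wait cost used for illustration elsewhere (the probability that a stage fails to reconnect) and searches a finite grid; the names `symmetric_candidate` and `fail_prob` are hypothetical.

```python
from fractions import Fraction

def symmetric_candidate(f, grid, n_actors):
    """Grid search for x* = argmin over x of f(x, x, ..., x)."""
    return min(grid, key=lambda x: f(*([x] * n_actors)))

# Assumed static Dial-Wait cost: probability that a stage fails to reconnect.
fail_prob = lambda p1, p2: 1 - (p1 * (1 - p2) + (1 - p1) * p2)
grid = [Fraction(j, 20) for j in range(21)]
print(symmetric_candidate(fail_prob, grid, 2))
```

On this grid the search returns one half, which agrees with the MUV solution $(.5, .5)$ identified for the Dial-Wait problem.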


7 Alternative Solution Concepts 

In this section we collect a number of solution concepts whose relationship to MUV has yet to 
be determined precisely. 


7.1 MUV+ 

In case the answer to Open Problem 1 turns out to be “No,” then we could consider an alternative 
version of the MUV solution concept, as follows. 

Definition 3 (MUV+) We say that an action profile $\xi^1$ is MUV+ if (i) $\xi^1$ has an ambiguous 
value ceiling and (ii) all $\xi$ such that $f(\xi) < f(\xi^1)$ are such that $\Phi(\xi) > f(\xi^1)$. 

Open Problem 7 The existence, uniqueness, implications of non-uniqueness, and characteri- 
zation of MUV+ solutions all need to be established. 


7.2 Minimax Confusion 


If we are willing to give up on the requirement that a “solution” to the identical interest game 
must have an ambiguous value ceiling, then we may consider yet another alternative to MUV, 
as follows. Define 

$$M^* = \inf_{\xi} \Phi(\xi). \tag{11}$$

Definition 4 If $\xi$ achieves the infimum in Equation (11), then we say it minimizes confusion 
in the identical interest game defined by $f$. 


Open Problem 8 The existence, uniqueness, implications of non-uniqueness, and characteri- 
zation of minimax confusion solutions all need to be established. 

7.3 Revised notion of “ambiguous value ceiling” 

Define 

$$\tilde\Phi(\xi) = \sup_{\xi^0 \,:\, f(\xi^0) \le f(\xi)} \phi(\xi^0, \xi), \tag{12}$$

and replace $\Phi$ with $\tilde\Phi$ in all earlier definitions, particularly in the definition of “ambiguous value 
ceiling.” By expanding the domain of the supremum in the definition, we do not significantly 
change our internalization of the MUV concept. However, considering the final remark of Open 
Problem 6, this may actually be the best definition. 